
Ensure single reader and writer to system fd on Unix#16209

Merged
straight-shoota merged 5 commits into crystal-lang:master from ysbaddaden:feature/add-crystal-fd-lock-with-refcount-and-serial-rw
Dec 1, 2025

Conversation

@ysbaddaden (Collaborator) commented Oct 14, 2025

This patch extends the fdlock to serialize reads and writes: the reference-counted lock gains a read lock and a write lock, so taking a reference and locking act as a single operation instead of two (1. acquire/release the lock; 2. take/return a reference). This avoids a race condition in the polling event loops:

  • Fiber 1 then Fiber 2 try to read from fd;
  • Since fd isn't ready, both fibers start waiting;
  • When fd becomes ready then Fiber 1 is resumed;
  • Fiber 1 doesn't read everything and returns;
  • Since events are edge-triggered, Fiber 2 won't be resumed!!!

With the read lock, fiber 2 will wait on the lock and be resumed by fiber 1 when it returns. A concrete example is multiple fibers accepting on the same socket: fiber 1 would keep handling connections while fiber 2 sits idle forever.

The other benefit is that it simplifies the event loops, which now only have to deal with a single reader and a single writer per IO. It is also required by the io_uring event loop (at least its MT version).
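The combined take-a-reference-and-lock operation can be sketched as follows. This is a minimal illustration only: the class name, fields, and methods are assumptions, not the actual `Crystal::FdLock` API.

```crystal
# Hypothetical sketch of a fd lock combining a reference count with
# serialized reads and writes. Illustrative only; not the real FdLock.
class FdLockSketch
  def initialize
    @refcount = Atomic(Int32).new(0)
    @read_mutex = Mutex.new
    @write_mutex = Mutex.new
  end

  # Taking a reference and acquiring the read lock happen as one
  # operation: only one fiber reads from the fd at any time, and the
  # fd can't be closed while the reference is held.
  def read
    @refcount.add(1)
    begin
      @read_mutex.synchronize { yield }
    ensure
      @refcount.sub(1)
    end
  end

  def write
    @refcount.add(1)
    begin
      @write_mutex.synchronize { yield }
    ensure
      @refcount.sub(1)
    end
  end
end
```

When fiber 1 returns from its read, releasing the read lock wakes fiber 2, which avoids the lost edge-triggered wakeup described above.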

NOTE: While this patch only serializes reads/writes on UNIX at the Crystal::System level, which is where the bugs are, we may want to move it into the stdlib for all targets at some point, for example to serialize reads and writes around IO::Buffered. See #16289 (comment)

Depends on #16288 and #16289.
Required by #16264.

@ysbaddaden (Collaborator, Author) commented:

I split the fdlock into two commits (refcount, then serial R/W) that outline the different steps, so they can be merged as individual PRs.

ysbaddaden added a commit to ysbaddaden/crystal that referenced this pull request Oct 18, 2025
ysbaddaden added a commit to ysbaddaden/crystal that referenced this pull request Oct 24, 2025
@ysbaddaden ysbaddaden mentioned this pull request Oct 24, 2025
@ysbaddaden ysbaddaden force-pushed the feature/add-crystal-fd-lock-with-refcount-and-serial-rw branch from 904dd95 to 6be2dd7 on October 28, 2025 16:16
@ysbaddaden ysbaddaden changed the title from "Fix: closing fd is thread unsafe on UNIX targets" to "UNIX: ensure single reader and writer to system fd" on Oct 28, 2025
@ysbaddaden ysbaddaden force-pushed the feature/add-crystal-fd-lock-with-refcount-and-serial-rw branch from 6be2dd7 to b92814a on October 30, 2025 17:39
Serializes reads and writes so we can assume any IO object will only
have at most one read op and one write op. The benefits are:

1. it avoids a race condition in the polling event loops:

   - Fiber 1 then Fiber 2 try to read from fd;
   - Since fd isn't ready, both are waiting;
   - When fd becomes ready then Fiber 1 is resumed;
   - Fiber 1 doesn't read everything and returns;
   - Fiber 2 won't be resumed because events are edge-triggered;

2. we can simplify the UNIX event loops (epoll, kqueue, io_uring) that
   are guaranteed to only have at most one reader and one writer at any
   time.
@ysbaddaden ysbaddaden force-pushed the feature/add-crystal-fd-lock-with-refcount-and-serial-rw branch from b92814a to 72507a7 on November 27, 2025 15:12
@ysbaddaden ysbaddaden marked this pull request as ready for review November 27, 2025 15:13
@ysbaddaden (Collaborator, Author) commented:

Rebased from master to bring #16288 and #16289 that this patch depends on.

@ysbaddaden (Collaborator, Author) commented Nov 27, 2025

Usages are still restricted to Crystal::System types on UNIX because moving it out requires implementing IOCP#shutdown, which itself needs the single reader and single writer locks that this PR brings (a chicken-and-egg issue 🐣), for reasons explained in #16289 (comment).

I'll prepare a third PR that will:

  • Implement Crystal::EventLoop::IOCP#shutdown(Socket);
  • Implement Crystal::EventLoop::IOCP#shutdown(IO::FileDescriptor);
  • Move @fd_lock to IO::FileDescriptor and Socket.

@straight-shoota (Member) left a comment:

🎉

@straight-shoota straight-shoota added this to the 1.19.0 milestone Nov 27, 2025
@straight-shoota straight-shoota merged commit 7bdbd04 into crystal-lang:master Dec 1, 2025
49 checks passed
@github-project-automation github-project-automation bot moved this from Review to Done in Multi-threading Dec 1, 2025
@straight-shoota straight-shoota changed the title from "UNIX: ensure single reader and writer to system fd" to "Ensure single reader and writer to system fd on Unix" on Dec 1, 2025
@ysbaddaden ysbaddaden deleted the feature/add-crystal-fd-lock-with-refcount-and-serial-rw branch December 1, 2025 11:58
straight-shoota pushed a commit that referenced this pull request Apr 14, 2026
Implements an event loop that leverages **io_uring** on Linux targets.

### Requirements

The event loop relies on features added across several kernel versions. At a minimum Linux 5.19 is required, and the recent Linux 6.13 is recommended. It is thus compatible with the Linux 6.1 SLTS kernel but not with earlier (S)LTS kernels.

The io_uring event loop is disabled by default. It must be enabled manually at compile time with the `-Devloop=io_uring` flag.

The SQPOLL feature is supported but disabled by default. It avoids syscalls on submissions & completions, which is very cool... but it [uses _lots_ of CPU](https://unixism.net/loti/tutorial/sq_poll.html) 🔥. It can be enabled at compile time via the `IORING_SQ_THREAD_IDLE` environment variable (in milliseconds), which sets the idle time for the SQPOLL thread.

For example:

```sh
export IORING_SQ_THREAD_IDLE=200
crystal build app.cr -Devloop=io_uring
```

### Implementation details

The basic implementation was straightforward. It's basically an async framework: submit an operation, suspend the fiber, and resume it when the operation has completed.
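The control flow can be illustrated roughly like this. The `Ring#submit_read` API shown here is invented for the sketch (the real event loop resumes the fiber directly from the completion queue); a `Channel` stands in for the wakeup mechanism.

```crystal
# Rough sketch of the submit/suspend/resume pattern. The Ring API is
# hypothetical; only the control flow matters.
def async_read(ring, fd, slice)
  completion = Channel(Int32).new(1)

  # 1. submit: queue a read SQE and remember how to resume the caller
  ring.submit_read(fd, slice) do |result|
    # 3. resume: invoked by the event loop when the CQE is reaped
    completion.send(result)
  end

  # 2. suspend: the calling fiber blocks here until the op completes
  completion.receive
end
```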

This is also the second event loop that uses blocking IO after IOCP on Windows, and the first one on UNIX.

The main issue is a Linux limitation: close doesn't interrupt operations pending in the kernel, so we must, for example, shut down sockets and cancel pending ops on files.

### Threads Support & Safety

The MT safe implementation (preview_mt, execution_context) was much more complex. Unlike the other event loops, we can't have a single ring: it would require locking on every submit, which with multiple threads would create contention and would likely require syscalls (defeating the point). So we need one ring per thread (sharing the same kernel resources).

There's thus a new API to register execution context schedulers to the event loop, so we can create/close rings as needed. Since a scheduler can shut down (e.g. after a resize down), the execution context must also drain its ring before the scheduler can stop: all the pending operations must have completed and all the pending fibers must be enqueued.
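A drain step could look roughly like this (entirely hypothetical names; the real scheduler/ring APIs differ):

```crystal
# Hypothetical drain loop: before a scheduler stops, wait for every
# pending operation on its ring to complete, enqueueing the woken
# fibers so another scheduler can run them.
def drain(ring)
  until ring.pending_operations.zero?
    ring.wait_for_completion do |fiber|
      fiber.enqueue
    end
  end
  ring.close
end
```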

We need cross-ring communication for a couple of scenarios: to interrupt a thread waiting on the event loop, and to cancel pending read/write file operations (the serial R/W of #16209 is required). At worst, this communication needs a lock on submit (which is avoided on Linux 6.13+). Unlike the single-ring design, this lock should usually not be contended in practice (unless you open lots of files, read/write from many fibers to the same file, and close from whatever fiber).

Unlike the other event loops, there isn't a single system instance for the whole event loop (e.g. one epoll, kqueue or IOCP), and each scheduler is responsible for its own completion queue... which means we're back to the situation where a busy thread can hold runnable fibers in its completion queue while other threads starve. A busy thread can be running a CPU-bound fiber, or a pair of fibers that keep re-enqueuing each other.

To avoid this situation, once in a while, and every time a scheduler would otherwise wait on the event loop (i.e. it's starving), the event loop will instead iterate the completion rings and try to steal runnable fibers from other threads. That requires a lock on each completion queue, which should also usually not be contended (it's only taken once in a while).
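The stealing pass might be sketched as follows (illustrative only; the names and structure are assumptions, not the actual implementation):

```crystal
# Hypothetical sketch: instead of blocking on its own ring, a starving
# scheduler walks the other rings' completion queues (under their locks)
# and steals the first runnable fiber it finds.
def steal_runnable(rings, own_ring)
  rings.each do |ring|
    next if ring == own_ring
    ring.completion_lock.synchronize do
      if fiber = ring.pop_runnable?
        return fiber
      end
    end
  end
  nil
end
```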